Read and spontaneous speech classification based on variance of GMM supervectors

Authors

Taichi Asami

Ryo Masumura

Hirokazu Masataki

Sumitaka Sakauchi

Abstract

This paper presents a novel method for classifying spoken utterances as read or spontaneous in style. Read/spontaneous speech classification is important for extracting data to train acoustic models for speech recognition from real data in which read speech and spontaneous speech samples are mixed. We analyzed 23,900 read and 31,988 spontaneous utterances from 30 speakers and found that the variance of GMM supervectors across several consecutive utterances discriminates between the read and spontaneous styles and is less speaker-dependent. Based on this finding, our method uses the variance of GMM supervectors to classify unknown consecutive utterances as read or spontaneous. Experiments show that our technique can classify 5 consecutive utterances from unknown speakers with over 95% accuracy without any other lexical, phonetic, or prosodic features.
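The feature described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the supervectors are built or how the variance is turned into a decision. The sketch assumes the common construction in which a diagonal-covariance universal background model (UBM) has its means MAP-adapted to each utterance and the adapted means are stacked into one vector. The function names, the relevance factor of 16, and the choice to average the per-dimension variances into a single number are all assumptions made for illustration.

```python
import math
from statistics import pvariance


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component for one frame."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def supervector(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means to one utterance and stack them.

    `relevance` is an assumed value, not taken from the paper.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k                       # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]     # posterior-weighted frame sums
    for x in frames:
        for c, p in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += p
            for j in range(d):
                sums[c][j] += p * x[j]
    stacked = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)
        for j in range(d):
            data_mean = sums[c][j] / counts[c] if counts[c] > 0 else means[c][j]
            stacked.append(alpha * data_mean + (1.0 - alpha) * means[c][j])
    return stacked


def supervector_variance(utterances, weights, means, variances):
    """Mean per-dimension variance of supervectors over consecutive utterances."""
    vectors = [supervector(u, weights, means, variances) for u in utterances]
    per_dimension = [pvariance(column) for column in zip(*vectors)]
    return sum(per_dimension) / len(per_dimension)


# Toy two-component UBM over two-dimensional frames (hypothetical values).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [3.0, 3.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]

utt_a = [[0.1, -0.2], [2.9, 3.1], [0.0, 0.3]]
utt_b = [[1.0, 1.5], [3.5, 2.5], [-0.5, 0.2]]

identical = supervector_variance([utt_a] * 5, ubm_weights, ubm_means, ubm_vars)
mixed = supervector_variance(
    [utt_a, utt_b, utt_a, utt_b, utt_a], ubm_weights, ubm_means, ubm_vars
)
```

In a real system the frames would be acoustic features such as MFCCs and the UBM would have many more components. The resulting scalar (or the full per-dimension variance vector) would then be passed to a classifier trained on labeled read and spontaneous speech; the abstract does not state which classifier the authors use.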

Similar resources

Speaker Diarization Based on Gmm Supervectors and Unsupervised Intra-speaker Variability Modeling

This paper presents a novel framework for speaker diarization. Audio is parameterized by a sequence of GMM-supervectors representing overlapping short segments of speech. Session-dependent intra-session intra-speaker variability is estimated online in an unsupervised manner, and is removed from the supervectors using Nuisance Attribute Projection (NAP). The supervectors are then projected using ...

Combining five acoustic level modeling methods for automatic speaker age and gender recognition

This paper presents a novel automatic speaker age and gender identification approach which combines five different methods at the acoustic level to improve the baseline performance. The five subsystems are (1) Gaussian mixture model (GMM) system based on mel-frequency cepstral coefficient (MFCC) features, (2) Support vector machine (SVM) based on GMM mean supervectors, (3) SVM based on GMM maxi...

Automatic speaker age and gender recognition using acoustic and prosodic level information fusion

The paper presents a novel automatic speaker age and gender identification approach which combines seven different methods at both acoustic and prosodic levels to improve the baseline performance. The three baseline subsystems are (1) Gaussian mixture model (GMM) based on mel-frequency cepstral coefficient (MFCC) features, (2) Support vector machine (SVM) based on GMM mean supervectors and (3) SVM...

An Integrated Solution for Snoring Sound Classification Using Bhattacharyya Distance Based GMM Supervectors with SVM, Feature Selection with Random Forest and Spectrogram with CNN

Snoring is caused by the narrowing of the upper airway and it is excited by different locations within the upper airways. This irregularity could lead to the presence of Obstructive Sleep Apnea Syndrome (OSAS). Diagnosis of OSAS could therefore be made by snoring sound analysis. This paper proposes a novel method to automatically classify snoring sounds by their excitation locations for ComPa...

Exploiting supervector structure for speaker recognition trained on a small development set

Nowadays, state-of-the-art speaker recognition systems obtain quite satisfactory results for both text-independent and text-dependent tasks as long as they are trained on a fair amount of development data from the target domain (assuming clean speech). In this work, we investigate the ability to build accurate speaker recognition systems using small amounts of data from the target domain without ...

Publication date: 2014
